55 research outputs found

    Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation

    The Gaussianity assumption has been consistently criticized as a main limitation of the Variational Autoencoder (VAE) despite its efficiency in computational modeling. In this paper, we propose a new approach that expands the model capacity (i.e., expressive power of distributional family) without sacrificing the computational advantages of the VAE framework. Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions, which possesses general distribution fitting capabilities for continuous variables. Our model is represented by a special form of a nonparametric M-estimator for estimating general quantile functions, and we theoretically establish the connection between the proposed model and quantile estimation. We apply the proposed model to synthetic data generation, and, in particular, our model demonstrates superiority in easily adjusting the level of data privacy.
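
The link between the asymmetric Laplace distribution and quantile estimation mentioned in the abstract above can be illustrated with a small sketch. This is not the paper's model: it only shows, under the usual textbook parameterization (an assumption here), that the asymmetric Laplace negative log-likelihood equals the check (pinball) loss up to a constant, so minimizing either recovers a sample quantile. All function names are hypothetical.

```python
import math
import random

def pinball_loss(ys, q, tau):
    """Mean check (pinball) loss of a constant prediction q at level tau."""
    total = 0.0
    for y in ys:
        diff = y - q
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(ys)

def ald_neg_log_likelihood(ys, q, tau, scale=1.0):
    """Mean negative log-likelihood under the asymmetric Laplace density
    f(y) = tau * (1 - tau) / scale * exp(-rho_tau((y - q) / scale)),
    where rho_tau is the check loss (assumed parameterization)."""
    # rho_tau is positively homogeneous, so dividing by scale afterwards is valid.
    scaled_loss = pinball_loss(ys, q, tau) / scale
    return scaled_loss - math.log(tau * (1.0 - tau) / scale)

def fit_constant_quantile(ys, tau):
    """Pick the sample point that minimizes the pinball loss (brute force)."""
    return min(ys, key=lambda q: pinball_loss(ys, q, tau))

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(500)]
estimate = fit_constant_quantile(sample, tau=0.9)
print("pinball-loss minimizer at tau=0.9:", estimate)
```

With the scale fixed, the two objectives differ only by a term that does not depend on `q`, which is why maximum likelihood under this density behaves like quantile regression.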

    Joint Distributional Learning via Cramer-Wold Distance

    The assumption of conditional independence among observed variables, primarily used in the Variational Autoencoder (VAE) decoder modeling, has limitations when dealing with high-dimensional datasets or complex correlation structures among observed variables. To address this issue, we introduced the Cramer-Wold distance regularization, which can be computed in closed form, to facilitate joint distributional learning for high-dimensional datasets. Additionally, we introduced a two-step learning method to enable flexible prior modeling and improve the alignment between the aggregated posterior and the prior distribution. Furthermore, we provide theoretical distinctions from existing methods within this category. To evaluate the synthetic data generation performance of our proposed approach, we conducted experiments on high-dimensional datasets with multiple categorical variables. Given that many readily available datasets and data science applications involve such datasets, our experiments demonstrate the effectiveness of our proposed methodology.

    Causally Disentangled Generative Variational AutoEncoder

    We present a new supervised learning technique for the Variational AutoEncoder (VAE) that allows it to learn a causally disentangled representation and generate causally disentangled outcomes simultaneously. We call this approach Causally Disentangled Generation (CDG). CDG is a generative model that accurately decodes an output based on a causally disentangled representation. Our research demonstrates that adding supervised regularization to the encoder alone is insufficient for achieving a generative model with CDG, even for a simple task. Therefore, we explore the necessary and sufficient conditions for achieving CDG within a specific model. Additionally, we introduce a universal metric for evaluating the causal disentanglement of a generative model. Empirical results from both image and tabular datasets support our findings.

    Interpretable Water Level Forecaster with Spatiotemporal Causal Attention Mechanisms

    Forecasting the water level of the Han river is important to control traffic and avoid natural disasters. There are many variables related to the Han river, and they are intricately connected. In this work, we propose a novel transformer that exploits the causal relationships among the variables, based on prior knowledge, and forecasts the water level at the Jamsu bridge in the Han river. Our proposed model considers both spatial and temporal causation by formalizing the causal structure as a multilayer network and using masking methods. This approach yields interpretability that is consistent with prior knowledge. In real data analysis, we use the Han river dataset from 2016 to 2021 and compare the proposed model with deep learning models.
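
The masking idea in the abstract above can be sketched as a softmax restricted to the connections that prior knowledge allows. This is only an illustration of the general technique, not the paper's architecture; the variable names, the mask, and the scores below are hypothetical.

```python
import math

def masked_softmax(scores, mask):
    """Row-wise softmax over scores, restricted to entries where mask is True.
    Masked-out entries receive exactly zero attention weight."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        if not allowed:
            raise ValueError("each row needs at least one allowed entry")
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical variables and prior-knowledge structure (not from the paper).
variables = ["rainfall", "dam_discharge", "water_level"]
# causal_mask[i][j] is True when variable j is allowed to influence variable i.
causal_mask = [
    [True, False, False],
    [True, True, False],
    [True, True, True],
]
scores = [
    [0.2, 1.5, -0.3],
    [0.7, 0.1, 2.0],
    [1.0, 0.5, 0.0],
]
weights = masked_softmax(scores, causal_mask)
for name, row in zip(variables, weights):
    print(name, [round(w, 3) for w in row])
```

Because disallowed connections get zero weight, every nonzero attention weight corresponds to an edge in the assumed causal structure, which is what makes the weights readable against prior knowledge.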

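    Restaurer la crédibilité des fondements juridiques et économiques de la stabilité financière: la nécessité d'incorporation des théories économiques? (Restoring the Credibility of the Legal and Economic Foundations of Financial Stability: The Need to Incorporate Economic Theories?)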

    To what extent can monetary and financial crises and cycles be explained through economic theories? This paper aims to highlight why a reliance on economic theories may be necessary, given certain flaws revealed by the recent Financial Crisis: namely, that the economic and legal foundations of financial stability cannot always be considered credible. Further, the paper aims to accentuate why, despite the valid argument that a reference to economic theories may be required to explain the causes of financial and monetary crises, those causes could also be explained from other perspectives, even though these perspectives may sometimes not be as accurate.

    Clustering for Regional Time Trend in the Nonstationary Extreme Distribution

    Since the estimation of tail properties requires stationary observations, it is necessary to develop a de-trending method for nonstationary hydrological processes that does not depend on the underlying distributions. Moreover, de-trending has been applied independently to hydrological processes, even though the processes are observed at geometrically adjacent sites. This paper presents a distribution-free de-trending method for nonstationary hydrological processes. Our method also provides clustered regional trends obtained by sparse regularization in a general distribution. It aggregates the parameter estimation and clustering within a unified framework. In the simulation study, our proposed method outperforms the compared methods with respect to MSE and variance of coefficients. In real data analysis, the clustered trends of the annual maximum precipitation in the South Korean peninsula are reported, and the patterns of the estimated trends are visualized.

    Learning a High-dimensional Linear Structural Equation Model via l1-Regularized Regression

    This paper develops a new approach to learning high-dimensional linear structural equation models (SEMs) without the commonly assumed faithfulness, Gaussian error distribution, and equal error distribution conditions. A key component of the algorithm is componentwise ordering and parent estimations, where both problems can be efficiently addressed using $\ell_1$-regularized regression. This paper proves that sample sizes $n = \Omega(d^2 \log p)$ and $n = \Omega(d^2 p^{2/m})$ are sufficient for the proposed algorithm to recover linear SEMs with subGaussian and $(4m)$-th bounded-moment error distributions, respectively, where $p$ is the number of nodes and $d$ is the maximum degree of the moralized graph. Further shown is the worst-case computational complexity $O(n(p^3 + p^2 d^2))$, and hence, the proposed algorithm is statistically consistent and computationally feasible for learning a high-dimensional linear SEM when its moralized graph is sparse. Through simulations, we verify that the proposed algorithm is statistically consistent and computationally feasible, and it performs well compared to the state-of-the-art US, GDS, LISTEN and TD algorithms with our settings. We also demonstrate through real COVID-19 data that the proposed algorithm is well-suited to estimating a virus-spread map in China.

    Statistical Road-Traffic Noise Mapping Based on Elementary Urban Forms in Two Cities of South Korea

    Statistical models that can generate a road-traffic noise map for a city or area where only elementary urban design factors are determined, and where no concrete urban morphology, including buildings and roads, is given, can provide basic but essential information for developing a quiet and sustainable city. Long-term cost-effective measures for a quiet urban area can be considered at early city planning stages by using the statistical road-traffic noise map. An artificial neural network (ANN) and an ordinary least squares (OLS) model were developed by utilizing data on urban form indicators, based on a 3D urban model and road-traffic noise levels from a normal noise map of city A (Gwangju). The developed ANN and OLS models were applied to city B (Cheongju), and the resultant statistical noise map of city B was compared to an existing normal road-traffic noise map of city B. The urban form indicators that showed multi-collinearity were excluded by the OLS model, and among the remaining urban forms, road-related urban form indicators such as traffic volume and road area density were found to be important variables to predict the road-traffic noise level and to design a quiet city. Comparisons of the statistical ANN and OLS noise maps with the normal noise map showed that the OLS model tends to under-estimate road-traffic noise levels, and the ANN model tends to over-estimate them.

    A Projection of Extreme Precipitation Based on a Selection of CMIP5 GCMs over North Korea

    The numerous choices among climate change scenarios make decision-making difficult when assessing climate change impacts. Previous studies have used climate models to compare performance in terms of simulating observed climates or preserving model variability among scenarios. In this study, the Katsavounidis-Kuo-Zhang algorithm was applied to select representative climate change scenarios (RCCS) that preserve the variability among all climate change scenarios (CCS). The performance of the multi-model ensemble of RCCS was evaluated for reference and future climates. The RCCS was found to be well suited to both the observations and the multi-model ensemble of all CCS. Using the RCCS under RCP (Representative Concentration Pathway) 8.5, future extreme precipitation was projected. As a result, the magnitude and frequency of extreme precipitation increased toward the far future. In particular, extreme precipitation (daily maximum precipitation with a 20-year return period) was projected to occur once every 8.3 years during 2070-2099. The RCCS employed in this study successfully represents the performance of all CCS; therefore, this approach can help manage water resources efficiently when assessing climate change impacts.
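
The return-period figures in the abstract above (20 years versus 8.3 years) translate into annual exceedance probabilities as sketched below. The conversion p = 1/T is standard; the 30-year horizon and the assumption that years are independent are illustrative choices made here, not taken from the study.

```python
def annual_exceedance_probability(return_period_years):
    """Probability that the event is exceeded in any single year (p = 1 / T)."""
    return 1.0 / return_period_years

def prob_at_least_one_event(return_period_years, horizon_years):
    """Chance of at least one exceedance over a horizon,
    assuming independent years (an illustrative simplification)."""
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years

reference_period = 20.0  # return period in the reference climate (from the abstract)
future_period = 8.3      # projected recurrence for 2070-2099 (from the abstract)
horizon = 30             # hypothetical planning horizon in years

frequency_ratio = reference_period / future_period
print("annual probability, reference:", annual_exceedance_probability(reference_period))
print("annual probability, future:   ", annual_exceedance_probability(future_period))
print("frequency ratio:", frequency_ratio)
print("P(at least one event in horizon), reference:",
      prob_at_least_one_event(reference_period, horizon))
print("P(at least one event in horizon), future:   ",
      prob_at_least_one_event(future_period, horizon))
```

Under these assumptions, the same precipitation magnitude becomes roughly 2.4 times as frequent in the projected period.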